153 research outputs found

    Early Prediction of Movie Box Office Success based on Wikipedia Activity Big Data

    The use of socially generated "big data" to access information about collective states of mind in human societies has become a new paradigm in the emerging field of computational social science. A natural application is predicting a society's reaction to a new product, in terms of popularity and adoption rate. However, bridging the gap between "real time monitoring" and "early predicting" remains a big challenge. Here we report on an endeavor to build a minimalistic predictive model for the financial success of movies based on collective activity data of online users. We show that the popularity of a movie can be predicted well before its release by measuring and analyzing the activity level of editors and viewers of the movie's entry in Wikipedia, the well-known online encyclopedia. Comment: 13 pages, including Supporting Information, 7 figures. Download the dataset from: http://wwm.phy.bme.hu/SupplementaryDataS1.zi

    Emergence of world-stock-market network

    In the age of globalization, it is natural that the stock market of each country is not independent from the other markets. In this case, collective behavior could emerge from their mutual dependency. This article studies the collective behavior of a set of forty influential markets in the world economy with the aim of exploring a global financial structure that could be called the world-stock-market network. To this end, we analyze the cross-correlation matrix of the indices of these forty markets using Random Matrix Theory (RMT). We find the degree of collective behavior among the markets and the share of each market in their structural formation. This finding, together with the results obtained from the same calculation on four stock markets, reinforces the idea of a world financial market. Finally, we draw the dendrogram of the cross-correlation matrix to make communities in this abstract global market visible. The dendrogram, drawn at a correlation threshold of at least thirty percent, shows that the world financial market comprises three communities, each of which includes stock markets in geographical proximity.
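
    The abstract does not spell out the RMT procedure, so the following is only a minimal sketch of one common variant: compare the largest eigenvalue of the cross-correlation matrix with the Marchenko-Pastur upper bound expected for uncorrelated series. The function names, the power-iteration solver, and the synthetic data are assumptions made for illustration, not the authors' method or data.

```python
import math
import random
from statistics import mean, pstdev


def marchenko_pastur_bounds(n_series, n_obs):
    """Eigenvalue range expected for a correlation matrix of purely random,
    unit-variance series: (1 -/+ sqrt(N/T))**2."""
    root = math.sqrt(n_series / n_obs)
    return (1 - root) ** 2, (1 + root) ** 2


def correlation_matrix(series):
    """Pearson cross-correlation matrix of equally long series."""
    standardized = []
    for s in series:
        mu, sigma = mean(s), pstdev(s)
        standardized.append([(x - mu) / sigma for x in s])
    n_obs = len(series[0])
    return [[sum(a * b for a, b in zip(u, v)) / n_obs for v in standardized]
            for u in standardized]


def largest_eigenvalue(matrix, iterations=500):
    """Power iteration; adequate here because a correlation matrix is
    symmetric and positive semi-definite (assumes the uniform start vector
    is not orthogonal to the leading eigenvector)."""
    n = len(matrix)
    vec = [1.0 / math.sqrt(n)] * n
    value = 0.0
    for _ in range(iterations):
        product = [sum(row[j] * vec[j] for j in range(n)) for row in matrix]
        value = math.sqrt(sum(x * x for x in product))
        vec = [x / value for x in product]
    return value


# Hypothetical data: five synthetic "markets" driven by one common factor.
random.seed(0)
common = [random.gauss(0, 1) for _ in range(200)]
returns = [[c + 0.3 * random.gauss(0, 1) for c in common] for _ in range(5)]

lower, upper = marchenko_pastur_bounds(len(returns), len(common))
top = largest_eigenvalue(correlation_matrix(returns))
print(f"largest eigenvalue {top:.2f} vs. random-matrix upper bound {upper:.2f}")
```

    In this reading, an eigenvalue far above the upper bound signals a collective mode shared by the series, which is the kind of evidence the abstract refers to as the degree of collective behavior.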

    Mapping the UK Webspace: Fifteen Years of British Universities on the Web

    This paper maps the national UK web presence on the basis of an analysis of the .uk domain from 1996 to 2010. It reviews previous attempts to use web archives to understand national web domains and describes the dataset. Next, it presents an analysis of the .uk domain, including the overall number of links in the archive and changes in the link density of different second-level domains over time. We then explore changes over time within a particular second-level domain, the academic subdomain .ac.uk, and compare linking practices with variables including institutional affiliation, league table ranking, and geographic location. We find no effect of institutional affiliation on linking practices and only partial evidence that league table ranking affects network centrality, but we find a clear inverse relationship between the density of links and the geographical distance between universities. This echoes prior findings regarding offline academic activity, which allows us to argue that real-world factors like geography continue to shape academic relationships even in the Internet age. We conclude with directions for future uses of web archive resources in this emerging area of research. Comment: To appear in the proceedings of WebSci 201

    Editorial: At the Crossroads: Lessons and Challenges in Computational Social Science

    The interest of physicists in economic and social questions is not new: during the last decades, we have witnessed the emergence of what is nowadays formally called sociophysics [1] and econophysics [2], which can be grouped under the common term “Interdisciplinary Physics” along with biophysics, medical physics, agrophysics, etc. With tools borrowed from statistical physics and complexity science, among others, these areas of study have already made important contributions to our understanding of how humans organize and interact in our modern society. Large scale data analyses, agent-based modeling and numerical simulations, and finally mathematical modeling, have led to the discovery of new (universal) patterns and their quantitative description in socio-economic systems.

    The Digital Flynn Effect: Complexity of Posts on Social Media Increases over Time

    Parents and teachers often express concern about the extensive use of social media by youngsters. Some of them see the emoticons, undecipherable initialisms and loose grammar typical of social media as evidence of language degradation. In this paper, we use a simple measure of text complexity to investigate how the complexity of public posts on a popular social networking site changes over time. We analyze a unique dataset that contains texts posted by 942,336 users from a large European city across nine years. We show that the chosen complexity measure is correlated with the academic performance of users: users from high-performing schools produce more complex texts than users from low-performing schools. We also find that the complexity of posts increases with age. Finally, we demonstrate that the overall language complexity of posts on the social networking site is constantly increasing. We call this phenomenon the digital Flynn effect. Our results suggest that the worries about language degradation may not be warranted.

    Circadian patterns of Wikipedia editorial activity: A demographic analysis

    Wikipedia (WP), as a collaborative, dynamical system of humans, is an appropriate subject of social studies. Every single action of the members of this society, i.e. the editors, is well recorded and accessible. Using the cumulative data of 34 Wikipedias in different languages, we try to characterize and find the universalities and differences in the temporal activity patterns of editors. Based on this data, we estimate the geographical distribution of editors around the globe for each WP. Furthermore, we clarify the differences among different groups of WPs, which originate in the variance of cultural and social features of the communities of editors.

    Dynamics of conflicts in Wikipedia

    In this work we study the dynamical features of editorial wars in Wikipedia (WP). Based on our previously established algorithm, we build up samples of controversial and peaceful articles and analyze the temporal characteristics of the activity in these samples. On short time scales, we show that there is a clear correspondence between conflict and burstiness of activity patterns, and that memory effects play an important role in controversies. On long time scales, we identify three distinct developmental patterns for the overall behavior of the articles. We are able to distinguish cases eventually leading to consensus from those cases where a compromise is far from achievable. Finally, we analyze discussion networks and conclude that edit wars are mainly fought by only a few editors. Comment: Supporting information added
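
    The abstract does not say how burstiness is quantified. As a hedged illustration, one widely used measure is the burstiness parameter B = (sigma - mu) / (sigma + mu), computed from the mean mu and standard deviation sigma of the gaps between consecutive events. The sketch below assumes this definition and uses made-up edit timestamps; it is not the paper's algorithm.

```python
from statistics import mean, pstdev


def burstiness(timestamps):
    """Burstiness parameter of an event sequence.

    B is -1 for perfectly regular events, about 0 for a Poisson process,
    and approaches +1 for highly bursty activity.
    """
    times = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    if len(gaps) < 2:
        raise ValueError("need at least three events")
    mu, sigma = mean(gaps), pstdev(gaps)
    if mu + sigma == 0:
        raise ValueError("all events share the same timestamp")
    return (sigma - mu) / (sigma + mu)


# Hypothetical edit times (in hours) for two articles.
regular_edits = [0, 10, 20, 30, 40]
bursty_edits = [0, 1, 2, 3, 100, 101, 102, 200]

print(burstiness(regular_edits))  # evenly spaced edits
print(burstiness(bursty_edits))   # clusters of edits separated by long pauses
```

    Under this definition, an article whose edits arrive in tight clusters separated by long pauses scores above zero, while evenly spaced edits score -1.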

    A practical approach to language complexity: a wikipedia case study

    In this paper we present a statistical analysis of English texts from Wikipedia. We try to address the issue of language complexity empirically by comparing the Simple English Wikipedia (Simple) to comparable samples of the main English Wikipedia (Main). Simple is supposed to use a more simplified language with a limited vocabulary, and editors are explicitly requested to follow this guideline, yet in practice the vocabulary richness of both samples is at the same level. Detailed analysis of longer units (n-grams of words and part of speech tags) shows that the language of Simple is less complex than that of Main primarily due to the use of shorter sentences, as opposed to drastically simplified syntax or vocabulary. Comparing the two language varieties by the Gunning readability index supports this conclusion. We also report on the topical dependence of language complexity, that is, that the language is more advanced in conceptual articles compared to person-based (biographical) and object-based articles. Finally, we investigate the relation between conflict and language complexity by analyzing the content of the talk pages associated with controversial and peacefully developing articles, concluding that controversy has the effect of reducing language complexity.
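
    For readers unfamiliar with the Gunning readability (fog) index mentioned above, here is a minimal sketch of its standard formula, 0.4 * (words per sentence + 100 * share of complex words), where a complex word has three or more syllables. The vowel-group syllable counter and the sample sentences are rough assumptions for illustration; the paper's exact tokenization is not described in the abstract.

```python
import re


def count_syllables(word):
    """Crude heuristic (an assumption): one syllable per run of vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def gunning_fog(text):
    """Gunning fog index: 0.4 * (avg sentence length + % complex words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    complex_words = [w for w in words if count_syllables(w) >= 3]
    return 0.4 * (len(words) / len(sentences)
                  + 100 * len(complex_words) / len(words))


simple_text = "The cat sat. The dog ran."
dense_text = "Controversial encyclopedias accumulate complicated terminology."

print(round(gunning_fog(simple_text), 1))
print(round(gunning_fog(dense_text), 1))
```

    Because the index depends on both sentence length and the share of long words, it can separate the two effects the abstract contrasts: shorter sentences lower the score even when vocabulary stays equally rich.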

    Human-machine networks: Towards a typology and profiling framework

    © Springer International Publishing Switzerland 2016. In this paper we outline an initial typology and framework for the purpose of profiling human-machine networks, that is, collective structures where humans and machines interact to produce synergistic effects. Profiling a human-machine network along the dimensions of the typology is intended to facilitate access to relevant design knowledge and experience. In this way the profiling of an envisioned or existing human-machine network will both facilitate relevant design discussions and, more importantly, serve to identify the network type. We present experiences and results from two case trials: a crisis management system and a peer-to-peer reselling network. Based on the lessons learnt from the case trials we suggest potential benefits and challenges, and point out needed future work.